
[responsesAPI][bugfix] serialize harmony messages#26185

Merged
yeqcharlotte merged 7 commits into vllm-project:main from qandrew:fix-serialize-harmony
Oct 7, 2025

Conversation

@qandrew
Contributor

@qandrew qandrew commented Oct 3, 2025

Purpose

Harmony messages are not being serialized properly; this was the state prior to this PR:

>>> context.parser.messages[0]
Message(author=Author(role=<Role.ASSISTANT: 'assistant'>, name=None), content=[TextContent(text='We need to respond as ChatGPT. The user says "Hello." We respond politely. Possibly ask how can help.')], channel='analysis', recipient=None, content_type=None)
>>> context.parser.messages[0].model_dump_json()
'{"author":{"role":"assistant","name":null},"content":[{}],"channel":"analysis","recipient":null,"content_type":null}'

We fix this by adding a custom serialization method in protocol.py. I filed an issue upstream: openai/harmony#78
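For context, the empty `content` objects above are a common pydantic v2 pitfall: when a field is annotated with a base class, `model_dump_json` serializes each item against the base schema and drops the subclass's fields. A minimal standalone sketch of the failure mode and the workaround idea (class names here are illustrative, not harmony's actual types):

```python
from pydantic import BaseModel

class Content(BaseModel):
    """Base content type with no fields of its own."""

class TextContent(Content):
    text: str

class Message(BaseModel):
    # The field is annotated with the base class, so pydantic serializes
    # each item using the base schema and drops the subclass's fields.
    content: list[Content]

msg = Message(content=[TextContent(text="hello")])
print(msg.model_dump_json())
# → '{"content":[{}]}'  (the text field is lost)

# Serializing each element via its concrete type keeps the fields:
print([c.model_dump() for c in msg.content])
# → [{'text': 'hello'}]
```

The custom serialization method in protocol.py follows the second approach, dumping each message by its concrete type instead of the declared base annotation.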

Test Plan

added unit tests, and ran locally successfully

server

 CUDA_VISIBLE_DEVICES=2,3 with-proxy vllm serve "/data/users/axia/checkpoints/gpt-oss-120b" -tp 2 --port 20001

client

curl http://localhost:20001/v1/responses   -H "Content-Type: application/json"   -N   -d '{
    "model": "/data/users/axia/checkpoints/gpt-oss-120b",
    "input": [
        {
            "role": "user",
            "content": "Hello."
        }
    ],
    "temperature": 0.7,
    "max_output_tokens": 256,
    "stream": true,
    "enable_response_messages": true
}'


...


event: response.completed
data: {"response":{"id":"resp_f6111e47923e423e9a21c35775af604c","created_at":1759515694,"incomplete_details":null,"instructions":null,"metadata":null,"model":"/data/users/axia/checkpoints/gpt-oss-120b","object":"response","output":[{"id":"rs_6f2b920ea35f4bbb99e6ca2cee9580de","summary":[],"type":"reasoning","content":[{"text":"We need to respond as ChatGPT, friendly greeting. Probably ask how can help.","type":"reasoning_text"}],"encrypted_content":null,"status":null},{"id":"msg_6b9222d5e23a46d4becc526cf101d085","content":[{"annotations":[],"text":"Hello! How can I assist you today?","type":"output_text","logprobs":null}],"role":"assistant","status":"completed","type":"message"}],"parallel_tool_calls":true,"temperature":0.7,"tool_choice":"auto","tools":[],"top_p":1.0,"background":false,"max_output_tokens":256,"max_tool_calls":null,"previous_response_id":null,"prompt":null,"reasoning":null,"service_tier":"auto","status":"completed","text":null,"top_logprobs":null,"truncation":"disabled","usage":{"input_tokens":67,"input_tokens_details":{"cached_tokens":0},"output_tokens":36,"output_tokens_details":{"reasoning_tokens":18,"tool_output_tokens":0},"total_tokens":103},"user":null,"input_messages":[{"author":{"role":"system","name":null},"content":[{"model_identity":"You are ChatGPT, a large language model trained by OpenAI.","reasoning_effort":"Medium","conversation_start_date":"2025-10-03","knowledge_cutoff":"2024-06","channel_config":{"valid_channels":["analysis","final"],"channel_required":true},"tools":null}],"channel":null,"recipient":null,"content_type":null},{"author":{"role":"user","name":null},"content":[{"text":"Hello."}],"channel":null,"recipient":null,"content_type":null}],"output_messages":[{"author":{"role":"assistant","name":null},"content":[{"text":"We need to respond as ChatGPT, friendly greeting. Probably ask how can help."}],"channel":"analysis","recipient":null,"content_type":null},{"author":{"role":"assistant","name":null},"content":[{"text":"Hello! How can I assist you today?"}],"channel":"final","recipient":null,"content_type":null}]},"sequence_number":38,"type":"response.completed"}

^ note that output_messages now carries the actual text content instead of empty objects (that was the previous bug).



Signed-off-by: Andrew Xia <axia@meta.com>
@qandrew qandrew changed the title [responseAPI] fix serialize harmony initial commit [responseAPI][bugfix] serialize harmony messages Oct 3, 2025
@mergify mergify bot added frontend gpt-oss Related to GPT-OSS models labels Oct 3, 2025
Signed-off-by: Andrew Xia <axia@meta.com>
Signed-off-by: Andrew Xia <axia@meta.com>
@qandrew qandrew marked this pull request as ready for review October 3, 2025 18:27
@qandrew qandrew changed the title [responseAPI][bugfix] serialize harmony messages [responsesAPI][bugfix] serialize harmony messages Oct 3, 2025
@qandrew
Contributor Author

qandrew commented Oct 3, 2025

cc @lacora , @houseroad , @yeqcharlotte , @alecsolder this is ready for review

qandrew and others added 4 commits October 3, 2025 16:47
Signed-off-by: Andrew Xia <axia@meta.com>
Signed-off-by: Andrew Xia <axia@meta.com>
Signed-off-by: Andrew Xia <axia@meta.com>
Collaborator

@yeqcharlotte yeqcharlotte left a comment


thanks for adding the tests

@github-project-automation github-project-automation bot moved this from To Triage to Ready in gpt-oss Issues & Enhancements Oct 7, 2025
@yeqcharlotte yeqcharlotte enabled auto-merge (squash) October 7, 2025 05:18
@github-actions github-actions bot added the ready ONLY add when PR is ready to merge/full CI is needed label Oct 7, 2025
@yeqcharlotte yeqcharlotte merged commit 185d8ed into vllm-project:main Oct 7, 2025
49 checks passed
Comment on lines +2132 to +2141
serialized = []
for m in msgs:
    if isinstance(m, dict):
        serialized.append(m)
    elif hasattr(m, "__dict__"):
        serialized.append(m.to_dict())
    else:
        # fallback to pydantic dump
        serialized.append(m.model_dump_json())
return serialized
Copy link
Copy Markdown
Collaborator


Seems we could consolidate the majority of the code (e.g. the message-to-serialized-message mapping) into a separate function? That would allow us to:

  1. replace list.append with a simple list comprehension (better readability and better performance)
  2. write better individual unit tests.

Collaborator


Trying to refactor it a bit in #26620

Dhruvilbhatt pushed a commit to Dhruvilbhatt/vllm that referenced this pull request Oct 14, 2025
Signed-off-by: Andrew Xia <axia@meta.com>
Co-authored-by: Ye (Charlotte) Qi <yeq@meta.com>
Signed-off-by: Dhruvil Bhatt <bhattdbh@amazon.com>
lywa1998 pushed a commit to lywa1998/vllm that referenced this pull request Oct 20, 2025
Signed-off-by: Andrew Xia <axia@meta.com>
Co-authored-by: Ye (Charlotte) Qi <yeq@meta.com>
alhridoy pushed a commit to alhridoy/vllm that referenced this pull request Oct 24, 2025
Signed-off-by: Andrew Xia <axia@meta.com>
Co-authored-by: Ye (Charlotte) Qi <yeq@meta.com>
rtourgeman pushed a commit to rtourgeman/vllm that referenced this pull request Nov 10, 2025
Signed-off-by: Andrew Xia <axia@meta.com>
Co-authored-by: Ye (Charlotte) Qi <yeq@meta.com>
devpatelio pushed a commit to SumanthRH/vllm that referenced this pull request Nov 29, 2025
Signed-off-by: Andrew Xia <axia@meta.com>
Co-authored-by: Ye (Charlotte) Qi <yeq@meta.com>

Labels

frontend gpt-oss Related to GPT-OSS models ready ONLY add when PR is ready to merge/full CI is needed

Projects

Status: Done


4 participants